Adding noop implementations for Sql persistence #5318
Draft
johnsimons wants to merge 76 commits into master from
Conversation
Refactors the upsert logic in several data stores to leverage EF Core's change tracking more efficiently. Instead of creating a new entity and then calling Update, the code now fetches the existing entity (if any) and modifies its properties directly. This reduces the overhead and potential issues associated with detached entities. The RecoverabilityIngestionUnitOfWork is also updated to use change tracking for FailedMessageEntity updates. This commit was made on the `john/more_interfaces` branch.
Adds data store and entities required for persisting licensing and throughput data. This includes adding new tables for licensing metadata, throughput endpoints, and daily throughput data, as well as configurations and a data store implementation to interact with these tables.
Also adds headers to the serialised entity.
Updates data stores to utilize IServiceScopeFactory instead of IServiceProvider for creating database scopes. This change improves dependency injection and resource management, ensuring proper scope lifecycle management, especially for asynchronous operations.
Adds full-text search capabilities for error messages, allowing users to search within message headers and, optionally, the message body. Introduces an interface for full-text search providers to abstract the database-specific implementation. Stores small message bodies inline for faster retrieval and populates a searchable text field from headers and the message body. Adds configuration option to set the maximum body size to store inline.
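The query side of this can be sketched as follows. This is an illustrative shape only, assuming SQL Server and a hypothetical `SearchableText` column holding the combined headers-plus-body text; actual table and column names may differ.

```sql
-- Hypothetical full-text query over the combined headers/body column.
SELECT Id, MessageId
FROM ProcessedMessages
WHERE CONTAINS(SearchableText, '"payment failed"');
```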
Removes the `internal` keyword from the `RecoverabilityJsonContext` class. This change allows the class to be accessible from other assemblies, potentially needed for serialization/deserialization scenarios outside the current assembly.
Stores message bodies to disk in parallel to improve ingestion performance. Instead of awaiting the completion of each write operation, it queues them, allowing multiple write tasks to run concurrently. It then awaits all tasks before saving the changes to the database.
Updates the configuration to no longer default the message body storage path to a location under `CommonApplicationData`. The path will now be empty by default. This change allows users to explicitly configure the storage location, preventing potential issues with default locations.
Refactors the Azure Blob Storage persistence to streamline its configuration. It removes the direct instantiation of BlobContainerClient within the base class and instead, registers the AzureBlobBodyStoragePersistence class for dependency injection, allowing the constructor to handle the BlobContainerClient creation. Additionally, it ensures that the ContentType metadata stored in Azure Blob Storage is properly encoded and decoded to handle special characters. Also, it adds MessageBodyStorageConnectionStringKey to the configuration keys for both PostgreSQL and SQL Server.
Implements data retention policy for audit messages and saga snapshots using a background service. This change introduces a base `RetentionCleaner` class that handles the logic for deleting expired audit data in batches. Database-specific implementations are provided for SQL Server and PostgreSQL, leveraging their respective locking mechanisms (sp_getapplock and advisory locks) to prevent concurrent executions of the cleanup process. Removes the registration of the `RetentionCleaner` from the base class and registers it on specific implementations. The cleanup process deletes processed messages and saga snapshots older than the configured retention period, optimizing database space and improving query performance.
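The database-specific locking described above can be sketched roughly as below. This is a sketch only: the lock name, timeout, and advisory-lock key are illustrative, not the values the implementation uses.

```sql
-- SQL Server: take an exclusive application lock so only one instance runs cleanup.
DECLARE @lockResult INT;
EXEC @lockResult = sp_getapplock
    @Resource = 'RetentionCleanup',   -- hypothetical lock name
    @LockMode = 'Exclusive',
    @LockOwner = 'Session',           -- session-level: held across transactions
    @LockTimeout = 0;                 -- fail fast instead of queueing

IF @lockResult >= 0
BEGIN
    -- ... run batched deletes here ...
    EXEC sp_releaseapplock @Resource = 'RetentionCleanup', @LockOwner = 'Session';
END

-- PostgreSQL equivalent: a session-level advisory lock keyed by an integer.
-- SELECT pg_try_advisory_lock(42);   -- returns true only for the lock holder
-- ... cleanup ...
-- SELECT pg_advisory_unlock(42);
```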
Wraps retention cleanup process in an execution strategy to handle transient database errors. Moves lock check to inside the execution strategy, and only logs success if the lock was acquired.
Resets the total deleted messages and snapshots counters, as well as the lockAcquired flag, on each retry attempt of the retention cleaner process. This prevents accumulation of values across retries when the execution strategy is used. Also, updates lock acquisition logic to use `AsAsyncEnumerable()` to prevent errors caused by non-composable SQL in `SqlQueryRaw` calls.
Adds metrics to monitor the retention cleanup process. This includes metrics for cleanup cycle duration, batch duration, deleted messages, skipped locks, and consecutive failures. These metrics provide insights into the performance and health of the retention cleanup process, allowing for better monitoring and troubleshooting.
Introduces ingestion throttling during retention cleanup to reduce contention. This change adds an `IngestionThrottleState` to manage the throttling. The retention cleaner now signals when cleanup starts and ends, and the audit ingestion process respects the current writer limit. A new `RetentionCleanupBatchDelay` setting is introduced to add a delay between processing batches of messages. Adds a capacity metric to monitor the current ingestion capacity.
Corrects an issue where endpoint reconciliation could lead to incorrect "LastSeen" values when endpoints are deleted and re-added. The previous implementation aggregated LastSeen values across all deleted records, potentially resulting in an outdated value being used. This change introduces a ranking mechanism to select the most recent LastSeen value for each KnownEndpointId during reconciliation. This ensures that the latest LastSeen value is used, improving the accuracy of endpoint activity tracking.
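A ranking query of this kind typically looks like the following. The table and column names here are assumed for illustration; the point is selecting the single most recent `LastSeen` per `KnownEndpointId` rather than aggregating across all deleted records.

```sql
-- Illustrative reconciliation shape: pick the newest LastSeen per endpoint.
WITH Ranked AS (
    SELECT KnownEndpointId,
           LastSeen,
           ROW_NUMBER() OVER (
               PARTITION BY KnownEndpointId
               ORDER BY LastSeen DESC) AS rn
    FROM DeletedKnownEndpoints   -- hypothetical source of deleted records
)
SELECT KnownEndpointId, LastSeen
FROM Ranked
WHERE rn = 1;                    -- keep only the most recent value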
Ensures distinct message IDs are deleted during retention cleanup. Adjusts the loop condition to continue deleting messages as long as the number of deleted items is greater than or equal to the batch size. This prevents premature termination of the cleanup process when a batch returns exactly the batch size, ensuring all eligible messages are removed.
Refactors the audit retention cleanup process to ensure reliability and prevent race conditions. It achieves this by: - Using session-level locks to maintain lock ownership across transactions, preventing premature lock release. - Encapsulating the entire cleanup process within a single lock, simplifying retry logic and ensuring all operations are executed by the same instance. - Wrapping each batch deletion in its own execution strategy and transaction to handle transient errors and maintain data consistency.
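A minimal T-SQL sketch of the batched-delete loop, including the `>=` loop condition mentioned above; the batch size, cutoff, and table name are illustrative.

```sql
-- Batched-delete sketch; values and names are assumptions.
DECLARE @BatchSize INT = 4000;
DECLARE @Cutoff DATETIME2 = DATEADD(DAY, -30, SYSUTCDATETIME());
DECLARE @Deleted INT = @BatchSize;

WHILE @Deleted >= @BatchSize     -- continue until a batch comes back short
BEGIN
    DELETE TOP (@BatchSize) FROM ProcessedMessages
    WHERE ProcessedAt < @Cutoff;
    SET @Deleted = @@ROWCOUNT;
END
```

Running each iteration in its own transaction keeps lock footprints small and lets a transient failure retry a single batch rather than the whole cleanup.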
Implements table partitioning based on the ProcessedAt timestamp. This change introduces table partitioning for both ProcessedMessages and SagaSnapshots tables to improve retention cleanup performance and manageability. Additionally, the change stores message bodies in date-based folders. Removes progressive ingestion throttling.
Prevents potential modification of the connection string by storing it in a read-only field. This enhances thread safety and data integrity, especially in concurrent scenarios like retention cleanup.
Adds a unique non-clustered index on the Id column of the ProcessedMessages table to be used as key index for the full-text index. Updates the creation of the full-text index on ProcessedMessages to use the newly created index. This change is required because the full-text index needs a unique key index.
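The index pair can be sketched as below; index and column names are assumed. SQL Server requires a single-column, unique, non-nullable key index for a full-text index, which is why the plain unique index on `Id` must exist first.

```sql
-- Sketch: unique key index, then the full-text index keyed on it.
CREATE UNIQUE NONCLUSTERED INDEX IX_ProcessedMessages_Id
    ON ProcessedMessages (Id);

CREATE FULLTEXT INDEX ON ProcessedMessages (SearchableText)  -- hypothetical column
    KEY INDEX IX_ProcessedMessages_Id;
```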
The retention period is rounded up to the nearest whole day to ensure that partitioning by day functions correctly and that no audit data is prematurely deleted. Since partitions are daily, only whole days should be used for retention period calculations.
Changes the partitioning scheme from daily to hourly. This provides more granular data management and improves query performance. The `ProcessedAt` timestamp is replaced with `CreatedOn`, truncated to the hour, as the partition key in both `ProcessedMessages` and `SagaSnapshots` tables. The body storage is also changed to store data in hourly folders.
Replaces partition truncation with a switch operation for greater compatibility with indexes. Aligns the `TimeSent` index with the partition scheme to improve query performance via partition elimination. Increases the database command timeout during migration to prevent failures on large databases.
Ensures that the staging table has a matching clustered index. This is a requirement for the `SWITCH PARTITION` operation to work correctly. The `SELECT INTO` statement copies columns but not indexes, so the index must be created explicitly.
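A sketch of the staging-table pattern described above; table, index, and partition numbers are illustrative. `SELECT INTO` copies columns but not indexes, so the matching clustered index must be recreated before `SWITCH` will succeed.

```sql
-- Create an empty staging table with the same columns (no indexes copied).
SELECT TOP (0) * INTO ProcessedMessages_Staging FROM ProcessedMessages;

-- Recreate the matching clustered index SWITCH requires on both sides.
CREATE CLUSTERED INDEX IX_Staging_CreatedOn
    ON ProcessedMessages_Staging (CreatedOn);

-- Move the expired partition out in a metadata-only operation...
ALTER TABLE ProcessedMessages SWITCH PARTITION 1 TO ProcessedMessages_Staging;
-- ...then drop the staging table to discard the rows.
DROP TABLE ProcessedMessages_Staging;
```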
Implements hourly partitioning for processed messages and saga snapshots in SQL Server to improve query performance and manage data retention. Introduces a partition function and scheme based on the 'CreatedOn' timestamp, and migrates existing tables onto this scheme. The partition manager handles boundary creation at runtime. Also, full-text search capabilities are added to the searchable content column for PostgreSQL and SQL Server. The staging table approach for partition truncation is replaced with a direct DELETE statement using partition elimination for SQL Server.
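The partitioning setup and the partition-elimination delete that replaced the staging approach look roughly like this; boundary values and object names are illustrative only.

```sql
-- Hourly partition function and scheme (sketch; real boundaries are created
-- at runtime by the partition manager).
CREATE PARTITION FUNCTION PF_Hourly (DATETIME2)
    AS RANGE RIGHT FOR VALUES ('2024-01-01T00:00', '2024-01-01T01:00');

CREATE PARTITION SCHEME PS_Hourly
    AS PARTITION PF_Hourly ALL TO ([PRIMARY]);

-- Deleting via $PARTITION lets SQL Server touch only the expired partition.
DELETE FROM ProcessedMessages
WHERE $PARTITION.PF_Hourly(CreatedOn) = 1;
```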
Increases the command timeout to 5 minutes for partition management operations. This change addresses potential timeout issues during partition cleanup, especially with the introduction of smaller, more frequent partitions. Long running partition delete operations might exceed the default timeout.
Pauses data ingestion during retention cleanup to prevent conflicts and data corruption: a flag blocks further ingestion while partitions are being dropped, avoiding race conditions.
Improves performance of hourly partition deletion by batching the deletion of records. This avoids potential timeouts and increases the efficiency of deleting large numbers of records.
Removes the 'CreatedOn' column from the composite indexes on the ProcessedMessages and SagaSnapshots tables. This change centralizes the partitioning logic within the database itself, leading to improved query performance and simplified index management.
Ensures correct handling of nullable DateTime values returned from the SQL query, preventing potential errors when determining the oldest hour with data. This change allows the system to gracefully handle cases where the database might not have any data, returning a null DateTime value which is then processed correctly.
Adds indexes to the CreatedOn column in the ProcessedMessages and SagaSnapshots tables for deployments where native table partitioning is not in use. These indexes improve query performance under retention policies, specifically the MIN() queries issued by the retention cleaner.
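A sketch of the index and the probe it supports; the index name is assumed. With `CreatedOn` as the leading key, the `MIN()` query becomes a cheap seek on the first index row instead of a scan.

```sql
-- Index supporting the retention cleaner's oldest-hour probe.
CREATE INDEX IX_ProcessedMessages_CreatedOn
    ON ProcessedMessages (CreatedOn);

-- The cleaner's probe; may return NULL when the table is empty,
-- which the caller handles as a nullable DateTime.
SELECT MIN(CreatedOn) FROM ProcessedMessages;
```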